Conversation
- Add write_rows() for bulk grid writes at arbitrary positions - Add fast bulk iter_rows(values_only=True) via single Rust FFI call - Add Rust backend support for write_sheet_values() and read_all_values() - Add sheet rename_sheet() support in Rust writer backend - Update README with batch API documentation and SynthGL case study - Add comprehensive openpyxl compatibility test suite (379 lines) - Bump version from 0.1.2 to 0.2.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This pull request adds high-performance batch read/write APIs to wolfxl and bumps the version from 0.1.2 to 0.2.0. The changes introduce three key optimizations: bulk write via write_rows() (3.7x faster), bulk read via iter_rows(values_only=True) (6.7x faster), and sheet renaming support in the Rust backend.
Changes:
- Added
write_rows()API for bulk grid writes at arbitrary positions - Added fast path for
iter_rows(values_only=True)using single Rust FFI call - Added
rename_sheet()support in Rust writer backend with automatic title setter sync - Bumped version to 0.2.0 and added
formulasas optional dependency for calc feature
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Added new dependencies (formulas, numpy, scipy, regex, etc.) for calc extra |
| pyproject.toml | Version bump to 0.2.0 and added optional calc dependencies |
| Cargo.toml | Version bump to 0.2.0 |
| python/wolfxl/_worksheet.py | Added write_rows(), bulk iter_rows path, title setter Rust sync, bulk writes buffer |
| src/rust_xlsxwriter_backend.rs | Added rename_sheet() method with comprehensive re-keying of all data structures |
| src/calamine_styled_backend.rs | Added read_sheet_values_plain() for direct Python value conversion |
| tests/test_wolfxl_compat.py | Added 383 new test lines covering rename, bulk read, write_rows, plain read |
| README.md | Added batch API documentation and performance comparison table |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
python/wolfxl/_worksheet.py
Outdated
| indiv_from_bulk: list[tuple[int, int, Any]] = [] | ||
| for ri, row in enumerate(grid): | ||
| for ci, val in enumerate(row): | ||
| if val is not None and ( | ||
| isinstance(val, bool) | ||
| or (isinstance(val, str) and val.startswith("=")) | ||
| or not isinstance(val, (int, float, str)) | ||
| ): | ||
| indiv_from_bulk.append((sr + ri, sc + ci, val)) | ||
| row[ci] = None |
There was a problem hiding this comment.
There's a code duplication issue here. Lines 369-378 (bulk writes scan) are nearly identical to lines 346-354 (append buffer scan). Both loops:
- Check the same conditions for non-batchable values
- Collect indices in a list
- Replace with None in the grid
Consider extracting this into a helper method to avoid duplication and improve maintainability. For example:
def _extract_non_batchable(self, grid: list[list[Any]], start_row: int, start_col: int) -> list[tuple[int, int, Any]]:
"""Extract non-batchable values from grid, replacing them with None."""
indiv: list[tuple[int, int, Any]] = []
for ri, row in enumerate(grid):
for ci, val in enumerate(row):
if val is not None and (
isinstance(val, bool)
or (isinstance(val, str) and val.startswith("="))
or not isinstance(val, (int, float, str))
):
indiv.append((start_row + ri, start_col + ci, val))
row[ci] = None
return indivThere was a problem hiding this comment.
Fixed in a56f0fc. Extracted _extract_non_batchable() static method that both the append buffer and bulk write scanning now call.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: de8ef7eb78
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| self._append_buffer = [] | ||
|
|
||
| # -- Flush bulk writes (write_rows) ----------------------------------- | ||
| for grid, sr, sc in self._bulk_writes: |
There was a problem hiding this comment.
Flush write_rows queue when saving in modify mode
write_rows() now queues grids into _bulk_writes, but that queue is only consumed in the writer-specific flush loop shown here. In load_workbook(..., modify=True) flows, _flush() uses the patcher path and never drains _bulk_writes, so ws.write_rows(...) changes are silently dropped on save unless each target cell is also written through the dirty-cell path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in a56f0fc. Added _materialize_bulk_writes() method that converts bulk write buffers into Cell objects. The patcher flush path now calls it (alongside the existing _materialize_append_buffer()) before processing dirty cells. This ensures write_rows() data is flushed in modify mode.
python/wolfxl/_worksheet.py
Outdated
| or not isinstance(val, (int, float, str)) | ||
| ): | ||
| indiv_from_bulk.append((sr + ri, sc + ci, val)) | ||
| row[ci] = None |
There was a problem hiding this comment.
Avoid mutating caller-provided grids during bulk flush
The bulk write flush mutates the original grid in place via row[ci] = None for non-batchable values. Because write_rows() stores the caller's list object directly, this side effect leaks out after save() and can corrupt results when the same grid is reused in multiple write_rows() calls (later writes lose booleans/formulas/dates that were nulled during the first flush).
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in a56f0fc. write_rows() now stores a shallow copy of the grid ([list(row) for row in rows]) so that the non-batchable extraction in flush can safely mutate row elements without affecting the caller's original lists.
src/calamine_styled_backend.rs
Outdated
| let s = ndt.date().format("%Y-%m-%d").to_string(); | ||
| Ok(s.to_object(py)) | ||
| } else { | ||
| let s = ndt.format("%Y-%m-%dT%H:%M:%S").to_string(); | ||
| Ok(s.to_object(py)) |
There was a problem hiding this comment.
Preserve date/datetime types in values_only bulk reads
The new plain bulk conversion formats Data::DateTime as strings and returns them directly, so the fast iter_rows(values_only=True) path now yields string timestamps instead of the date/datetime objects produced by the existing payload conversion path. This breaks callers that rely on typed date values for comparisons or arithmetic when bulk reads are enabled.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in a56f0fc. Updated data_to_plain_py() in Rust to return proper Python datetime.date / datetime.datetime objects using PyO3's PyDate::new() / PyDateTime::new() constructors. Also handles DateTimeIso values with proper midnight detection (returns date vs datetime). This matches the typed behavior of the non-bulk _payload_to_python() path.
1. Extract _extract_non_batchable() helper to eliminate code duplication between append buffer and bulk write scanning (Copilot). 2. Drain _bulk_writes in modify mode by materializing into Cell objects before patcher flush, preventing silent data loss (Codex P1). 3. Copy grids in write_rows() to avoid mutating caller-provided lists during flush (Codex P2). 4. Return typed Python date/datetime objects from data_to_plain_py() instead of ISO strings, matching the non-bulk read path behavior. Also handle DateTimeIso values with proper type detection (Codex P2). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- POWER: return #NUM! for negative base with fractional exponent instead of leaking complex numbers (Copilot #2, Codex #11) - LEFT: return #VALUE! for negative num_chars (Copilot #1) - MID: return #VALUE! for start < 1 or num_chars < 0 (Copilot #8) - range_shape: use min/max instead of abs() for consistency with expand_range (Copilot #3) - Add @_requires_formulas skip marker to formulas-dependent test classes so CI passes without wolfxl[calc] extra (Codex #10) - Add 4 edge case tests: POWER(-1,0.5), LEFT(-1), MID(0,..), MID(.,-1) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
write_rows()for bulk grid writes at arbitrary positions, bypassing per-cell FFI overheaditer_rows(values_only=True)bulk read path via single Rust FFI call (6.7x faster than openpyxl)write_sheet_values()andread_all_values()in both calamine and xlsxwriter backendsrename_sheet()support in Rust writer backendPerformance
ws.append(row)ws.write_rows(grid)ws.iter_rows(values_only=True)Test plan
🤖 Generated with Claude Code